Goto

Collaborating Authors

 simple cache model


A Simple Cache Model for Image Recognition

Neural Information Processing Systems

Training large-scale image recognition models is computationally expensive. This raises the question of whether there might be simple ways to improve the test performance of an already trained model without having to re-train or fine-tune it with new data. Here, we show that, surprisingly, this is indeed possible. The key observation we make is that the layers of a deep network close to the output layer contain independent, easily extractable class-relevant information that is not contained in the output layer itself. We propose to extract this extra class-relevant information using a simple key-value cache memory to improve the classification performance of the model at test time.


Reviews: A Simple Cache Model for Image Recognition

Neural Information Processing Systems

This paper presents a cache model to be used in image recognition tasks. The authors argue that class specific information can be retrieved from earlier layers of the network to improve the accuracy of an already trained model, without having to re-train of finetune. This is achieved by extracting and caching the activations of some layers along with the class at training time. At test time a similarity measure is used to calculate how far/close the input is compared to information stored in memory. Experiments show that performance is improved in CIFAR 10/100 and ImageNet.


A Simple Cache Model for Image Recognition

Orhan, Emin

Neural Information Processing Systems

Training large-scale image recognition models is computationally expensive. This raises the question of whether there might be simple ways to improve the test performance of an already trained model without having to re-train or fine-tune it with new data. Here, we show that, surprisingly, this is indeed possible. The key observation we make is that the layers of a deep network close to the output layer contain independent, easily extractable class-relevant information that is not contained in the output layer itself. We propose to extract this extra class-relevant information using a simple key-value cache memory to improve the classification performance of the model at test time.